Data science, Signal analysis

Mathematics for data science and signal processing is a theme that runs across the activities of the Institute’s members.

In this field, the methods studied and the choice of mathematical tools are motivated by statistical and deterministic problems in signal and image processing (2D and 3D), and by the processing of structured data in the form of vectors, matrices, or temporal data. The data is often massive, high-dimensional, heterogeneous, and potentially imperfectly collected.

The diversity of viewpoints at the Institute allows us to address all the mathematical aspects of the problems studied:

  • their modeling;

  • obtaining theoretical guarantees on the performance of a method;

  • the numerical resolution of the mathematical problem and the analysis of the performance of the numerical methods;

  • the application of methods to concrete problems, in interaction with collaborators from other scientific disciplines (medicine, biology, environmental sciences, meteorology, physics of matter, epidemiology, etc.) and from industry.

Examples include image reconstruction in biomedical imaging, gene expression data analysis, mathematical modeling of geophysical fluid flows and their applications in hydrology and glaciology, flood and coastal inundation prediction models, and the improvement of regional weather forecasts for better anticipation of extreme events. This work is carried out in collaboration with various partners, including CNES, OMP, SHOM, and Météo-France.

For these various applications, we process satellite and spatial data, measurements of geophysical fluid flows, data from the analysis of simulation codes, robotics, predictive maintenance, various industrial data, functional data, high-throughput biological data, omics data (genomics, transcriptomics, proteomics, metabolomics, etc.), observational data from patient follow-ups, as well as medical images, microscopy images, and hyperspectral images.

These issues are addressed and the approaches are validated using a wide range of mathematical objects and tools. Without being exhaustive, we can mention: neural networks, physics-inspired learning methods, partial differential equations, Gaussian processes, sensitivity analysis, inverse problems, data assimilation, sparse models, optimal transport, geometric approaches to statistics, and extreme value theory.

Solutions are computed using the methods best suited to the problem. Here again, the range is very wide: members of the Institute work on fast numerical methods, efficient simulation methods, convex, non-convex, and non-smooth optimization problems, stochastic optimization, and online, distributed, or incremental processing. Codes and software solutions are developed, mostly as open source.